8 research outputs found

    Statistical models and decision making for robotic scientific information gathering

    Submitted in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution September 2018. Mobile robots and autonomous sensors have seen increasing use in scientific applications, from planetary rovers surveying for signs of life on Mars, to environmental buoys measuring and logging oceanographic conditions in coastal regions. This thesis makes contributions in both planning algorithms and model design for autonomous scientific information gathering, demonstrating how theory from machine learning, decision theory, theory of optimal experimental design, and statistical inference can be used to develop online algorithms for robotic information gathering that are robust to modeling errors, account for spatiotemporal structure in scientific data, and have probabilistic performance guarantees. This thesis first introduces a novel sample selection algorithm for online, irrevocable sampling in data streams that have spatiotemporal structure, such as those that commonly arise in robotics and environmental monitoring. Given a limited sampling capacity, the proposed periodic secretary algorithm uses an information-theoretic reward function to select samples in real-time that maximally reduce posterior uncertainty in a given scientific model. Additionally, we provide a lower bound on the quality of samples selected by the periodic secretary algorithm by leveraging the submodularity of the information-theoretic reward function. Finally, we demonstrate the robustness of the proposed approach by employing the periodic secretary algorithm to select samples irrevocably from a seven-year oceanographic data stream collected at the Martha’s Vineyard Coastal Observatory off the coast of Cape Cod, USA.
Secondly, we consider how scientific models can be specified in environments – such as the deep sea or deep space – where domain scientists may not have enough a priori knowledge to formulate a formal scientific model and hypothesis. These domains require scientific models that start with very little prior information and construct a model of the environment online as observations are gathered. We propose unsupervised machine learning as a technique for science model-learning in these environments. To this end, we introduce a hybrid Bayesian-deep learning model that learns a nonparametric topic model of a visual environment. We use this semantic visual model to identify observations that are poorly explained in the current model, and show experimentally that these highly perplexing observations often correspond to scientifically interesting phenomena. On a marine dataset collected by the SeaBED AUV on the Hannibal Sea Mount, images of high perplexity in the learned model corresponded, for example, to a scientifically novel crab congregation in the deep sea. The approaches presented in this thesis capture the depth and breadth of the problems facing the field of autonomous science. Developing robust autonomous systems that enhance our ability to perform exploratory science in environments such as the oceans, deep space, agricultural and disaster-relief zones will require insight and techniques from classical areas of robotics, such as motion and path planning, mapping, and localization, and from other domains, including machine learning, spatial statistics, optimization, and theory of experimental design. This thesis demonstrates how theory and practice from these diverse disciplines can be unified to address problems in autonomous scientific information gathering.

    Statistical models and decision making for robotic scientific information gathering

    Thesis: S.M., Joint Program in Applied Ocean Physics and Engineering (Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science; and the Woods Hole Oceanographic Institution), 2018. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 97-107). By Genevieve Elaine Flaspohler. S.M.

    Balancing exploration and exploitation: task-targeted exploration for scientific decision-making

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution September 2022. How do we collect observational data that reveal fundamental properties of scientific phenomena? This is a key challenge in modern scientific discovery. Scientific phenomena are complex: they have high-dimensional and continuous state, exhibit chaotic dynamics, and generate noisy sensor observations. Additionally, scientific experimentation often requires significant time, money, and human effort. In the face of these challenges, we propose to leverage autonomous decision-making to augment and accelerate human scientific discovery. Autonomous decision-making in scientific domains faces an important and classical challenge: balancing exploration and exploitation when making decisions under uncertainty. This thesis argues that efficient decision-making in real-world, scientific domains requires task-targeted exploration: exploration strategies that are tuned to a specific task. By quantifying the change in task performance due to exploratory actions, we enable decision-makers that can contend with highly uncertain real-world environments, performing exploration parsimoniously to improve task performance. The thesis presents three novel paradigms for task-targeted exploration that are motivated by and applied to real-world scientific problems. We first consider exploration in partially observable Markov decision processes (POMDPs) and present two novel planners that leverage task-driven information measures to balance exploration and exploitation. These planners drive robots in simulation and oceanographic field trials to robustly identify plume sources and track targets with stochastic dynamics. We next consider the exploration-exploitation trade-off in online learning paradigms, a robust alternative to POMDPs when the environment is adversarial or difficult to model.
We present novel online learning algorithms that balance exploitative and exploratory plays optimally under real-world constraints, including delayed feedback, partial predictability, and short regret horizons. We use these algorithms to perform model selection for subseasonal temperature and precipitation forecasting, achieving state-of-the-art forecasting accuracy. The human scientific endeavor is poised to benefit from our emerging capacity to integrate observational data into the process of model development and validation. Realizing the full potential of these data requires autonomous decision-makers that can contend with the inherent uncertainty of real-world scientific domains. This thesis highlights the critical role that task-targeted exploration plays in efficient scientific decision-making and proposes three novel methods to achieve task-targeted exploration in real-world oceanographic and climate science applications. This material is based upon work supported by the NSF Graduate Research Fellowship Program and a Microsoft Research PhD Fellowship, as well as the Department of Energy / National Nuclear Security Administration under Award Number DE-NA0003921, the Office of Naval Research under Award Number N00014-17-1-2072, and DARPA under Award Number HR001120C0033.

    Quantifying the swimming gaits of veined squid (Loligo forbesii) using bio-logging tags

    Author Posting. © Company of Biologists, 2019. This article is posted here by permission of Company of Biologists for personal use, not for redistribution. The definitive version was published in Journal of Experimental Biology 222 (2019):jeb.198226, doi: 10.1242/jeb.198226. Squid are mobile, diverse, ecologically important marine organisms whose behavior and habitat use can have substantial impacts on ecosystems and fisheries. However, as a consequence in part of the inherent challenges of monitoring squid in their natural marine environment, fine-scale behavioral observations of these free-swimming, soft-bodied animals are rare. Bio-logging tags provide an emerging way to remotely study squid behavior in their natural environments. Here, we applied a novel, high-resolution bio-logging tag (ITAG) to seven veined squid, Loligo forbesii, in a controlled experimental environment to quantify their short-term (24 h) behavioral patterns. Tag accelerometer, magnetometer and pressure data were used to develop automated gait classification algorithms based on overall dynamic body acceleration, and a subset of the events were assessed and confirmed using concurrently collected video data. Finning, flapping and jetting gaits were observed, with the low-acceleration finning gaits detected most often. The animals routinely used a finning gait to ascend (climb) and then glide during descent with fins extended in the tank's water column, a possible strategy to improve swimming efficiency for these negatively buoyant animals. Arms- and mantle-first directional swimming were observed in approximately equal proportions, and the squid were slightly but significantly more active at night. These tag-based observations are novel for squid and indicate a more efficient mode of movement than suggested by some previous observations.
The combination of sensing, classification and estimation developed and applied here will enable the quantification of squid activity patterns in the wild to provide new biological information, such as in situ identification of behavioral states, temporal patterns, habitat requirements, energy expenditure and interactions of squid through space–time in the wild. This work was supported by Woods Hole Oceanographic Institution’s Ocean Life Institute and the Innovative Technology Program, Hopkins Marine Station’s Marine Life Observatory (to K.K.), as well as the National Science Foundation Program for Instrument Development for Biological Research (award no. 1455593 to T.A.M., K.K. and K.A.S.). F.C. thanks the President’s International Fellowship Initiative (PIFI) of the Chinese Academy of Sciences. G.E.F. thanks the National Science Foundation GRFP and National Science Foundation REU programs for support of this research. 2020-10-2

    Near-optimal irrevocable sample selection for periodic data streams with applications to marine robotics

    We consider the task of monitoring spatiotemporal phenomena in real time by deploying limited sampling resources at locations of interest irrevocably and without knowledge of future observations. This task can be modeled as an instance of the classical secretary problem. Although this problem has been studied extensively in theoretical domains, existing algorithms require that data arrive in random order to provide performance guarantees. These algorithms will perform arbitrarily poorly on data streams such as those encountered in robotics and environmental monitoring domains, which tend to have spatiotemporal structure. We focus on the problem of selecting representative samples from phenomena with periodic structure and introduce a novel sample selection algorithm that recovers a near-optimal sample set according to any monotone submodular utility function. We evaluate our algorithm on a seven-year environmental dataset collected at the Martha’s Vineyard Coastal Observatory and show that it selects phytoplankton sample locations that are nearly optimal in an information-theoretic sense for predicting phytoplankton concentrations in locations that were not directly sampled. The proposed periodic secretary algorithm can be used with theoretical performance guarantees in many real-time sensing and robotics applications for streaming, irrevocable sample selection from periodic data streams.
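For readers unfamiliar with the classical secretary problem named in this abstract, the following is a minimal sketch of the textbook single-choice stopping rule: observe roughly the first n/e items without selecting, then irrevocably take the first later item that beats everything seen so far. It is not the periodic secretary algorithm from the paper; the function name `classical_secretary` and the scalar `values` input are illustrative assumptions, and the rule's guarantee relies on the random arrival order that the abstract says real data streams lack.

```python
import math


def classical_secretary(values):
    """Return the index chosen by the textbook 1/e stopping rule.

    Illustrative only: assumes scalar scores arriving one at a time,
    with each accept/reject decision being irrevocable.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")

    # Observation phase: look at the first floor(n / e) items, select none.
    k = int(n / math.e)
    threshold = max(values[:k], default=float("-inf"))

    # Selection phase: accept the first item that beats the observed best.
    for i in range(k, n):
        if values[i] > threshold:
            return i

    # Nothing beat the threshold, so the rule is forced to take the last item.
    return n - 1
```

On a stream sorted in increasing or seasonal order this rule can pick a poor item, which is the failure mode on structured data that the abstract describes.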

    Feature discovery and visualization of robot mission data using convolutional autoencoders and Bayesian nonparametric topic models

    © 2017 IEEE. The gap between our ability to collect interesting data and our ability to analyze these data is growing at an unprecedented rate. Recent algorithmic attempts to fill this gap have employed unsupervised tools to discover structure in data. Some of the most successful approaches have used probabilistic models to uncover latent thematic structure in discrete data. Despite the success of these models on textual data, they have not generalized as well to image data, in part because of the spatial and temporal structure that may exist in an image stream. We introduce a novel unsupervised machine learning framework that incorporates the ability of convolutional autoencoders to discover features from images that directly encode spatial information, within a Bayesian nonparametric topic model that discovers meaningful latent patterns within discrete data. By using this hybrid framework, we overcome the fundamental dependency of traditional topic models on rigidly hand-coded data representations, while simultaneously encoding spatial dependency in our topics without adding model complexity. We apply this model to the motivating application of high-level scene understanding and mission summarization for exploratory marine robots. Our experiments on a seafloor dataset collected by a marine robot show that the proposed hybrid framework outperforms current state-of-the-art approaches on the task of unsupervised seafloor terrain characterization. NSF Graduate Research Fellowship Program award; The John P. Chase Memorial Endowed Fun

    Approximate Distributed Spatiotemporal Topic Models for Multi-Robot Terrain Characterization

    Unsupervised learning techniques, such as Bayesian topic models, are capable of discovering latent structure directly from raw data. These unsupervised models can endow robots with the ability to learn from their observations without human supervision, and then use the learned models for tasks such as autonomous exploration, adaptive sampling, or surveillance. This paper extends single-robot topic models to the domain of multiple robots. The main difficulty of this extension lies in achieving and maintaining global consensus among the unsupervised models learned locally by each robot. This is especially challenging for multi-robot teams operating in communication-constrained environments, such as marine robots. We present a novel approach for multi-robot distributed learning in which each robot maintains a local topic model to categorize its observations and model parameters are shared to achieve global consensus. We apply a combinatorial optimization procedure that combines local robot topic distributions into a globally consistent model based on topic similarity, which we find mitigates topic drift when compared to a baseline approach that matches topics naïvely. We evaluate our methods experimentally by demonstrating multi-robot underwater terrain characterization using simulated missions on real seabed imagery. Our proposed method achieves similar model quality under bandwidth constraints to that achieved by models that continuously communicate, despite requiring less than one percent of the data transmission needed for continuous communication. National Science Foundation (Award 1734400)

    Discovering hydrothermalism from afar: in situ methane instrumentation and change-point detection for decision-making

    Seafloor hydrothermalism plays a critical role in fundamental interactions between geochemical and biological processes in the deep ocean. A significant number of hydrothermal vents are hypothesized to exist, but many of these remain undiscovered due in part to the difficulty of detecting hydrothermalism using standard sensors on rosettes towed in the water column or robotic platforms performing surveys. Here, we use in situ methane sensors to complement standard sensing technology for hydrothermalism discovery and compare sensing equipment on a towed rosette and autonomous underwater vehicle (AUV) during a 17 km long transect in the Northern Guaymas Basin. This transect spatially intersected with a known hydrothermally active venting site. These data show that methane signaled possible hydrothermal activity 1.5-3 km laterally (100-150 m vertically) from a known vent. Methane as a signal for hydrothermalism performed similarly to standard turbidity sensors (plume detection 2.2-3.3 km from reference source), and more sensitively and clearly than temperature, salinity, and oxygen instruments which readily respond to physical mixing in background seawater. We additionally introduce change-point detection algorithms (streaming cross-correlation and regime identification) as a means of real-time hydrothermalism discovery and discuss related data monitoring technologies that could be used in planning, executing, and monitoring explorative surveys for hydrothermalism. NSF OCE OTIC: #1842053; Woods Hole Oceanographic Institution: Innovative Technology Award; NOAA Ocean Exploration: #NA18OAR0110354; Schmidt Marine Technology Partners: #G-21-62431; NASA: #NNX17AB31G; NSF OCE: #0838107; Gordon and Betty Moore Foundation: #9208; NDSEG: Graduate Fellowship; MIT Martin Family Society of Fellows: Graduate Fellowship; Microsoft: Graduate Research Fellowship; DOE/National Nuclear Security Administration: #DE-NA000392; MIT EAPS: Houghton Fun
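The abstract names streaming cross-correlation as one of its change-point tools but does not give the algorithm. As a rough illustration of the general idea only, the sketch below computes a Pearson correlation between two sensor streams over a sliding window, so that a shift in how the signals co-vary shows up as the samples arrive. The function name `rolling_correlation`, the window length, and the choice of a zero-lag Pearson coefficient are all assumptions, not the paper's method.

```python
import math
from collections import deque


def rolling_correlation(xs, ys, window):
    """Pearson correlation of the most recent `window` paired samples.

    Hypothetical helper: returns one value per input pair, with None
    until the window has filled. A flat (zero-variance) window gives 0.0.
    """
    buf_x = deque(maxlen=window)
    buf_y = deque(maxlen=window)
    out = []
    for x, y in zip(xs, ys):
        buf_x.append(x)
        buf_y.append(y)
        if len(buf_x) < window:
            out.append(None)
            continue
        mean_x = sum(buf_x) / window
        mean_y = sum(buf_y) / window
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(buf_x, buf_y))
        sxx = sum((a - mean_x) ** 2 for a in buf_x)
        syy = sum((b - mean_y) ** 2 for b in buf_y)
        denom = math.sqrt(sxx * syy)
        out.append(sxy / denom if denom > 0 else 0.0)
    return out
```

A survey tool built on this idea might flag the samples where the windowed correlation between, say, methane and turbidity crosses a chosen threshold; the threshold and the pairing of sensors would need to be tuned to the deployment.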